Table of Contents

Journal of Artificial Intelligence and Data Mining
Volume: 10, Issue: 3, Summer 2022

  • Publication date: 1401/07/16
  • Number of titles: 12
  • M. R. Okhovvat, M. T. Kheirabadi *, A. Nodehi, M. Okhovvat Pages 303-310

    Minimizing make-span and maximizing remaining energy are usually of chief importance in applications of wireless sensor actor networks (WSANs). Current task assignment approaches are typically concerned with either timing or energy constraints alone. These approaches do not consider the types and various features of the tasks WSANs may need to perform, and thus may not be applicable to some real applications such as search and rescue missions. To this end, an optimized, type-aware task assignment approach called TATA is proposed that considers energy consumption as well as make-span and is aware of the distribution requirements of WSANs with hybrid architecture. TATA comprises two protocols, namely a Make-span Calculation Protocol (MaSC) and an Energy Consumption Calculation Protocol (ECal). By considering both time and energy, TATA makes a tradeoff between minimizing make-span and maximizing the residual energies of actors. A series of extensive simulation results on typical scenarios shows shorter make-span and larger remaining energy in comparison to the stochastic task assignment (STA), opportunistic load balancing (OLB), and quasi-Newton interior point (TA-QNIP) task assignment approaches.

    Keywords: Energy Consumption, Make-span, Task Assignment, Wireless Sensor Actor Networks
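    The make-span/energy tradeoff described in this abstract can be illustrated by a minimal sketch. This is not the TATA protocol itself: the greedy assignment rule, the weighted cost, and the energy model below are illustrative assumptions only.

```python
# Hedged sketch: greedy task assignment trading off make-span against
# residual energy. The cost model is an assumption, not the TATA method.

def assign_tasks(tasks, actors, alpha=0.5):
    """Assign each task to the actor minimizing a weighted cost.

    tasks  : list of (exec_time, energy_cost) per task
    actors : list of dicts with 'load' (current finish time) and 'energy'
    alpha  : weight between make-span (alpha) and energy depletion (1 - alpha)
    """
    schedule = []
    for t_time, t_energy in tasks:
        best, best_cost = None, float("inf")
        for i, a in enumerate(actors):
            if a["energy"] < t_energy:
                continue  # actor lacks the energy to run this task
            finish = a["load"] + t_time          # candidate finish time
            depletion = t_energy / a["energy"]   # relative energy drain
            cost = alpha * finish + (1 - alpha) * depletion
            if cost < best_cost:
                best, best_cost = i, cost
        actors[best]["load"] += t_time
        actors[best]["energy"] -= t_energy
        schedule.append(best)
    return schedule

def makespan(actors):
    """The make-span is the latest finish time over all actors."""
    return max(a["load"] for a in actors)
```

    With alpha = 1.0 the rule reduces to pure make-span minimization and spreads equal tasks evenly over the actors.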
  • A.R. Tajary *, H. Morshedlou Pages 311-320

    With the advent of many-core processors, which place many processor cores on a single chip, the demand for exploiting these on-chip resources to boost application performance has increased. Task mapping is the problem of mapping application tasks onto these processor cores to achieve lower latency and better performance. Much research has focused on minimizing the path between tasks that demand high communication bandwidth. Although these methods can result in lower latency, they may at the same time create congestion in the network, which lowers network throughput. In this paper, a throughput-aware method is proposed that uses simulated annealing for task mapping. The method is evaluated on several real-world applications, and simulations are conducted on a cycle-accurate network-on-chip (NoC) simulator. The results illustrate that the proposed method can achieve higher throughput while maintaining the delay in the NoC.

    Keywords: Simulated annealing, Many-core processors, Task mapping
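    The core idea of simulated-annealing task mapping can be sketched briefly. The cost model (bandwidth-weighted Manhattan hop count on a square mesh), the swap move, and the geometric cooling schedule are illustrative assumptions and not this paper's exact throughput-aware method.

```python
import math
import random

# Hedged sketch: simulated annealing maps tasks onto a width x width mesh,
# minimizing bandwidth-weighted hop distance between communicating tasks.

def hops(p, q, width):
    """Manhattan distance between cores p and q on the mesh."""
    return abs(p % width - q % width) + abs(p // width - q // width)

def comm_cost(mapping, traffic, width):
    """Sum of bandwidth * hop distance over all communicating task pairs."""
    return sum(bw * hops(mapping[a], mapping[b], width)
               for (a, b), bw in traffic.items())

def sa_map(n_tasks, width, traffic, iters=2000, t0=10.0, seed=0):
    rng = random.Random(seed)
    mapping = list(range(n_tasks))       # task i runs on core mapping[i]
    cost = comm_cost(mapping, traffic, width)
    best_map, best_cost = mapping[:], cost
    temp = t0
    for _ in range(iters):
        i, j = rng.sample(range(n_tasks), 2)
        mapping[i], mapping[j] = mapping[j], mapping[i]   # swap move
        new = comm_cost(mapping, traffic, width)
        # accept improvements, and worse moves with Boltzmann probability
        if new < cost or rng.random() < math.exp((cost - new) / temp):
            cost = new
            if cost < best_cost:
                best_map, best_cost = mapping[:], cost
        else:
            mapping[i], mapping[j] = mapping[j], mapping[i]  # revert
        temp = max(temp * 0.995, 1e-3)   # geometric cooling
    return best_map, best_cost
```

    Keeping the best mapping seen so far guarantees the returned cost never exceeds the initial one, even though uphill moves are occasionally accepted.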
  • Z. Falahiazar, A.R. Bagheri *, M. Reshadi Pages 321-332

    Spatio-temporal (ST) clustering is a relatively new field in data mining that has gained great popularity, especially in geographic information. Moving objects are a type of ST data in which the available information on each object includes its latest position. Clustering moving objects is performed by running the clustering operation over all time sequences. The problem with density-based clustering under this strategy is that the density of clusters may change at any point in time because of the displacement of points. Hence, the input parameters of an algorithm such as DBSCAN used to cluster moving objects change and have to be determined again. The DBSCAN-based methods proposed so far assume that the input parameter values are fixed over time and provide no solution for determining them automatically. Nonetheless, as the objects move and the density of the clusters changes, these parameters have to be re-determined appropriately in each time interval. This paper uses a dynamic multi-objective genetic algorithm to determine the parameters of the DBSCAN algorithm dynamically and automatically. In each time interval, the proposed algorithm uses the clustering information of the previous interval to determine the parameters. Beijing traffic-control data was used as a moving-object dataset to evaluate the proposed algorithm. The experiments show that dynamic determination of the DBSCAN input parameters with the proposed algorithm outperforms DBSCAN with fixed input parameters in terms of the Silhouette and outlier indices.

    Keywords: Density-based Clustering, DBSCAN, Dynamic Multi-Objective Optimization, Clustering Moving Objects, Cluster Validity Index
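    Why eps must track the moving data can be shown with the classical k-distance heuristic: as points drift apart, the average distance to the MinPts-th nearest neighbour grows, so a fixed eps stops matching the data. The paper's actual mechanism is a dynamic multi-objective genetic algorithm; the heuristic below is only a simpler stand-in.

```python
# Hedged sketch: the k-distance heuristic for re-estimating DBSCAN's eps
# at each time interval. This is NOT the paper's GA-based method.

def kth_nn_distance(points, k):
    """Average Euclidean distance from each point to its k-th nearest neighbour.

    points : list of (x, y) positions at one time interval
    k      : neighbour rank, typically MinPts - 1
    """
    dists = []
    for i, (x, y) in enumerate(points):
        d = sorted(((x - u) ** 2 + (y - v) ** 2) ** 0.5
                   for j, (u, v) in enumerate(points) if j != i)
        dists.append(d[k - 1])     # distance to the k-th neighbour
    return sum(dists) / len(dists)
```

    On a dense configuration the estimate is small; after the same objects spread out, the same formula yields a proportionally larger eps.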
  • N. Elyasi *, M. Hosseini Moghadam Pages 333-344

    In this paper, we use the topological data analysis (TDA) mapper algorithm alongside a deep convolutional neural network to classify medical images. Deep learning models and convolutional neural networks can capture the Euclidean relation of a data point with its neighboring data points, such as the pixels of an image; they are particularly good at modeling data structures that live in Euclidean space but are not effective at modeling data structures that live in non-Euclidean spaces. TDA-based methods can extract not only Euclidean but also topological features of data. For the first time, we apply a neural network as one of the filter steps of the Kepler Mapper algorithm to classify skin cancer images. The major advantage of this method is that Kepler Mapper visualizes the classification result as a simplicial complex, while the neural network increases the accuracy of classification. Furthermore, we apply the TDA mapper and persistent homology algorithms to analyze the layers of the Xception network in different training epochs. We also use persistence diagrams to visualize the results of this layer analysis and then compare them by Wasserstein distances.

    Keywords: Topological Data analysis, Mapper, Persistent Homology, Neural network
  • S. Sareminia * Pages 345-360

    In recent years, the occurrence of various pandemics (COVID-19, SARS, etc.) and their widespread impact on human life have led researchers to focus on their pathology and epidemiology. One of the most significant consequences of these epidemics is the human mortality rate, which has highly adverse social effects. In addition to the major attributes affecting the COVID-19 mortality rate (health factors, people's health status, and climate), this study considers the social and economic components of societies. These components have been extracted from the countries' Human Development Index (HDI), and the effect of the level of social development on the mortality rate has been investigated using ensemble data mining methods. The results indicate that the level of community education has the highest effect on the disease mortality rate; its effect is much greater than that of environmental factors such as air temperature, regional health factors, and community welfare. This is probably due to the ability of knowledge-based societies to manage crises, their attention to health advisories, their lower susceptibility to rumors, and consequently a lower incidence of mental health problems. This study shows the impact of education on reducing the severity of the crisis in communities and opens a new window on cultural and social factors in the interpretation of medical data. Furthermore, a comparison of different single and ensemble data mining methods shows that the ensemble method gives the best results in terms of classification accuracy and prediction error.

    Keywords: Coronavirus Disease (COVID-19), Pandemics, Ensemble Data Mining Methods, HDI Index
  • Seyedeh R. Mahmudi Nezhad Dezfouli *, Y. Kyani, Seyed A. Mahmoudinejad Dezfouli Pages 361-372

    Due to the small size, low contrast, and variable position, shape, and texture of multiple sclerosis lesions, their automatic diagnosis and segmentation in magnetic resonance images is a challenge in medical image processing. Early detection of these lesions in the first stages of the disease can support effective diagnosis and treatment evaluation, and automated segmentation is a powerful tool to help professionals improve the accuracy of disease diagnosis. This study uses modified adaptive multi-level conditional random fields and an artificial neural network to segment and diagnose multiple sclerosis lesions. Instead of assuming the model coefficients to be constant, they are treated as variables in multi-level statistical models. The study evaluates the probability of lesions based on severity, texture, and adjacent areas. The proposed method was applied to 130 MR images of multiple sclerosis patients in two test stages and achieved 98% precision. By correcting the lesion boundaries using the average intensity of neighborhoods, rotation invariance, and texture for very small lesions of 3-5 voxels, the method also reduced the error detection rate and showed very few false-positive lesions. The proposed model achieved a high sensitivity of 91% with an average of 0.5 false positives.

    Keywords: Image segmentation, Automatic Detection, Multiple Sclerosis, Adaptive Multi-Level Conditional Random Fields (AMCRF), Artificial Neural Network
  • P. Kavehzadeh, M. M. Abdollah Pour, S. Momtazi * Pages 373-383

    Over the last few years, text chunking has played a significant part in sequence labeling tasks. Although a large variety of methods have been proposed for shallow parsing in English, most approaches proposed for text chunking in the Persian language are based on simple and traditional concepts. In this paper, we propose using state-of-the-art transformer-based contextualized models, namely BERT and XLM-RoBERTa, as the major structure of our models. A Conditional Random Field (CRF), the combination of a Bidirectional Long Short-Term Memory (BiLSTM) and a CRF, and a simple dense layer are employed after the transformer-based models to enhance performance in predicting chunk labels. Moreover, we provide a new dataset for noun-phrase chunking in Persian that includes annotated Persian news text. Our experiments reveal that XLM-RoBERTa achieves the best performance among all the architectures tried on the proposed dataset. The results also show that a single CRF layer yields better results than a dense layer and even the combination of BiLSTM and CRF.

    Keywords: Persian text chunking, sequence labeling, deep learning, contextualized word representation
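    At inference time, the "single CRF layer" over a transformer encoder amounts to Viterbi decoding: per-token emission scores plus a learned label-transition matrix. The sketch below shows only the decoding step with invented scores; nothing here comes from the paper's trained BERT or XLM-RoBERTa models.

```python
# Hedged sketch: Viterbi decoding of a linear-chain CRF layer.
# Emission and transition scores are illustrative, not learned weights.

def viterbi(emissions, transitions, n_labels):
    """Return the highest-scoring label sequence.

    emissions  : per-token score lists, shape [T][n_labels]
    transitions: transitions[i][j] = score of moving from label i to label j
    """
    score = list(emissions[0])     # best score ending in each label so far
    back = []                      # backpointers, one list per later step
    for t in range(1, len(emissions)):
        new_score, ptrs = [], []
        for j in range(n_labels):
            cands = [score[i] + transitions[i][j] for i in range(n_labels)]
            best = max(range(n_labels), key=lambda i: cands[i])
            new_score.append(cands[best] + emissions[t][j])
            ptrs.append(best)
        score, back = new_score, back + [ptrs]
    # trace back from the best final label
    last = max(range(n_labels), key=lambda j: score[j])
    path = [last]
    for ptrs in reversed(back):
        last = ptrs[last]
        path.append(last)
    return list(reversed(path))
```

    The transition matrix is what lets the CRF prefer label sequences as a whole, e.g. penalizing an I-NP tag that does not follow a B-NP, which a plain dense layer cannot do.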
  • S. Ghandibidgoli *, H. Mokhtari Pages 385-400

    In many robotics applications, a mobile robot should be guided from a source to a specific destination, and the automatic control and guidance of a mobile robot is a challenge in robotics. In the current paper, this problem is studied using various machine learning methods. Controlling a mobile robot means helping it make the right decision about changing direction according to the information read by the sensors mounted around the robot's waist. The machine learning methods are trained using three large sensor-reading datasets obtained from the UCI machine learning repository. The employed methods include (i) discriminators: the greedy hypercube classifier and support vector machines; (ii) parametric approaches: the naive Bayes classifier with and without dimensionality reduction; (iii) semiparametric algorithms: the Expectation-Maximization (EM) algorithm, C-means, K-means, and agglomerative clustering; (iv) nonparametric approaches for estimating the density function: histogram and kernel estimators; (v) nonparametric approaches for learning: k-nearest neighbors and decision trees; and (vi) combinations of multiple learners: boosting and bagging. These methods are compared on various metrics. Computational results indicate superior performance of the implemented methods compared to previous methods on the mentioned datasets. In general, boosting, bagging, and the unpruned and pruned trees (θ = 10⁻⁷) give better results than existing ones. The implemented decision tree is also more efficient than the other employed methods, improving the classification precision, TP rate, FP rate, and MSE of the classes by 0.1%, 0.1%, 0.001%, and 0.001%, respectively.

    Keywords: guidance of mobile robot, classifier, parametric approach, semiparametric approach, nonparametric approach
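    Of the methods this abstract lists, k-nearest neighbors is the simplest to sketch: a sensor-reading vector is mapped to a steering decision by majority vote among its nearest training examples. The sensor vectors and labels below are invented for illustration, not the UCI wall-following data itself.

```python
from collections import Counter

# Hedged sketch: k-NN classification of sensor readings into steering
# decisions. Training data here is invented, not the UCI dataset.

def knn_predict(train, query, k=3):
    """train: list of (sensor_vector, label) pairs; query: sensor vector."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    # take the k training examples closest to the query reading
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]   # majority-vote decision
```

    With real wall-sensor data the vectors would have one dimension per ultrasound sensor around the robot's waist, but the voting logic is unchanged.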
  • N. Esfandian *, F. Jahani Bahnamiri, S. Mavaddati Pages 401-409

    This paper proposes a novel method for voice activity detection based on clustering in the spectro-temporal domain. In the proposed algorithm, an auditory model is used to extract the spectro-temporal features, and Gaussian Mixture Model and WK-means clustering methods are used to reduce the dimensionality of the spectro-temporal space. The energy and positions of the clusters are then used for voice activity detection: silence/speech is recognized using the attributes of the clusters and a threshold value updated in each frame. Having the highest energy, the first cluster is used as the main speech section in the computation. The efficiency of the proposed method was evaluated for silence/speech discrimination under different noisy conditions, with the displacement of clusters in the spectro-temporal domain taken as the criterion for the robustness of the features. According to the results, the proposed method improves the speech/non-speech segmentation rate compared to temporal and spectral features at low signal-to-noise ratios (SNRs).

    Keywords: Spectro-temporal Features, Auditory Model, Gaussian mixture model, WK-means clustering, Voice Activity Detection
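    The "threshold value updated in each frame" idea can be illustrated in its simplest form: an energy detector that tracks the noise floor on silence frames. The paper's actual method clusters spectro-temporal features with GMM/WK-means; the energy detector below is only a stand-in, and all constants are assumptions.

```python
# Hedged sketch: frame-wise speech/silence decision with an adaptive
# threshold. Stands in for the paper's spectro-temporal clustering method.

def vad(frames, init_floor=1.0, ratio=3.0, alpha=0.9):
    """Label each frame True (speech) or False (silence).

    frames : list of frames, each a list of samples
    ratio  : a frame is speech when its energy exceeds ratio * noise floor
    alpha  : smoothing factor for the noise-floor update on silence frames
    """
    floor = init_floor
    labels = []
    for frame in frames:
        energy = sum(s * s for s in frame) / len(frame)
        is_speech = energy > ratio * floor
        if not is_speech:
            # only silence frames update the noise floor, so a long
            # utterance cannot drag the threshold upward
            floor = alpha * floor + (1 - alpha) * energy
        labels.append(is_speech)
    return labels
```

    Because the floor adapts downward during silence, the detector stays usable when the background noise level differs from the initial estimate.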
  • Z. Imanimehr * Pages 411-422

    Peer-to-peer video streaming has received considerable attention in recent years. Streaming video over peer-to-peer networks is a good way to deliver video on the Internet because of its high scalability, high video quality, and low bandwidth requirements. This paper addresses live video streaming in peer-to-peer networks that contain selfish peers. To encourage peers to cooperate in video distribution, tokens are used as an internal currency: peers gain tokens when they accept requests from other peers to upload video chunks to them, and spend tokens when sending requests to other peers to download video chunks from them. To handle the heterogeneity of peer bandwidths, the video is assumed to be multi-layer coded; the same token is used for every layer but is priced differently per layer, so peers can request various qualities based on their available token pools. A new token-based incentive mechanism is proposed that adapts the admission-control policy of peers to the dynamics of the request submission, request arrival, request timing, and bandwidth availability processes. Since peer-to-peer requests can arrive at any time, a continuous-time Markov Decision Process is used.

    Keywords: layered video coding, token, incentive, Q-learning, continuous Markov Decision Process
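    The token bookkeeping behind such a mechanism is simple to sketch: a peer earns tokens by uploading chunks and spends them to download, with higher layers priced higher. The prices and the admission rule below are illustrative assumptions; the paper couples admission with a continuous-time MDP, which this sketch does not model.

```python
# Hedged sketch: token accounting for a layered-video incentive scheme.
# Prices per layer are invented; the MDP-based policy is not modeled.

LAYER_PRICE = {0: 1, 1: 2, 2: 3}   # assumed price per chunk of each layer

class Peer:
    def __init__(self, tokens=0):
        self.tokens = tokens

    def upload(self, layer):
        """Serving another peer's request earns that layer's price."""
        self.tokens += LAYER_PRICE[layer]

    def request(self, layer):
        """A download request is admitted only if the peer can pay."""
        price = LAYER_PRICE[layer]
        if self.tokens < price:
            return False           # rejected: not enough tokens
        self.tokens -= price
        return True
```

    Pricing higher layers more makes extra quality something a peer must earn by uploading, which is what discourages free-riding.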
  • N. Shayanfar, V. Derhami *, M. Rezaeian Pages 423-431

    In video prediction, the next frame of a video is to be predicted from a sequence of input frames. Although numerous studies tackle frame prediction, satisfactory performance has not yet been achieved, so the application remains an open problem. In this article, multiscale processing is studied for video prediction and a new network architecture for multiscale processing is presented. This architecture belongs to the broad family of autoencoders and comprises an encoder and a decoder. A pretrained VGG is used as the encoder, processing a pyramid of input frames at multiple scales simultaneously, while the decoder is based on 3D convolutional neurons. The presented architecture is studied on three datasets of varying difficulty and compared to two conventional autoencoders. It is observed that combining the pretrained network with multiscale processing yields a performant approach.

    Keywords: deep learning, Convolutional autoencoder, Video prediction, multiscale processing
  • V. Ghasemi, A. Ghanbari Sorkhi * Pages 433-447

    Deploying m-connected k-covering (MK) wireless sensor networks (WSNs) is crucial for reliable packet delivery and target coverage. This paper proposes implementing random MK WSNs based on expected m-connected k-covering (EMK) WSNs, defined as random WSNs that are mathematically expected to be both m-connected and k-covering. Random EMK WSNs are deployed by deriving a relationship between m-connectivity and k-coverage, together with a lower bound on the required number of nodes, and it is shown that EMK WSNs tend to be MK asymptotically. An algorithm with polynomial worst-case and linear average-case complexity is presented to turn an EMK WSN into an MK one in non-asymptotic conditions. The m-connectivity is founded on the concept of support sets to strictly guarantee the existence of m disjoint paths between every node and the sink. The theoretical results are assessed via experiments, and several metaheuristic solutions are benchmarked to reveal the appropriate size of the generated MK WSNs.

    Keywords: m-connectivity, k-coverage, Wireless sensor networks, support sets
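    The flavor of a node-count lower bound for expected k-coverage can be shown with a back-of-the-envelope binomial model: under uniform random deployment (ignoring border effects), each sensor covers a fixed point with probability p = πr²/A, so the chance the point is k-covered is a Binomial(n, p) tail. This is an illustrative simplification, not the paper's bound, which is coupled with m-connectivity.

```python
import math

# Hedged sketch: binomial-tail estimate of k-coverage under uniform random
# deployment. A simplification, not the paper's EMK lower bound.

def k_coverage_prob(n, k, p):
    """P(at least k of n sensors cover a fixed point), Binomial(n, p) tail."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

def min_nodes(k, p, target=0.99, n_max=10000):
    """Smallest n whose k-coverage probability reaches the target."""
    for n in range(k, n_max + 1):
        if k_coverage_prob(n, k, p) >= target:
            return n
    return None  # target unreachable within n_max nodes
```

    For instance, with p = 0.5 the 1-coverage probability is 1 - (1 - p)^n, so seven nodes already push it past 99%.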